[CI/Build] Upgrade CANN to 8.2.RC1 #1653
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@            Coverage Diff             @@
##             main    #1653      +/-   ##
==========================================
- Coverage   73.16%   73.11%   -0.06%
==========================================
  Files          90       90
  Lines        9929     9929
==========================================
- Hits         7265     7260       -5
- Misses       2664     2669       +5
This pull request has conflicts, please resolve those before we can evaluate the pull request.
Considering that the CANN alpha3 release brings many OOM and internal errors, we will not upgrade to it.
RuntimeError: replay:build/CMakeFiles/torch_npu.dir/compiler_depend.ts:201 NPU function error: c10_npu::acl::AclmdlRIExecuteAsync(model_ri_, c10_npu::getCurrentNPUStream()), error code is 507000
[2] Unexpected OOM:
https://github.com/vllm-project/vllm-ascend/actions/runs/16133440759/job/45525106780?pr=1653
RuntimeError: NPU out of memory. Tried to allocate 98.00 MiB (NPU 0; 29.50 GiB total capacity; 1010.15 MiB already allocated; 1010.15 MiB current active; 39.48 MiB free; 1.02 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.
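The figures in the OOM message above can be checked against its own hint about `max_split_size_mb`. The following is a minimal sketch, not part of the PR: the helper names `parse_oom` and `summarize` are hypothetical, and the message is assumed to follow the layout quoted above. It converts the reported sizes to MiB and compares reserved memory with allocated memory.

```python
import re

# OOM message copied from the CI log above (hint sentence omitted).
OOM_MESSAGE = (
    "RuntimeError: NPU out of memory. Tried to allocate 98.00 MiB "
    "(NPU 0; 29.50 GiB total capacity; 1010.15 MiB already allocated; "
    "1010.15 MiB current active; 39.48 MiB free; "
    "1.02 GiB reserved in total by PyTorch)"
)

_UNIT_TO_MIB = {"KiB": 1 / 1024, "MiB": 1.0, "GiB": 1024.0}
_SIZE = r"([\d.]+) (KiB|MiB|GiB)"


def parse_oom(message):
    """Return the sizes quoted in an OOM message, converted to MiB.

    Hypothetical helper: it assumes the message layout shown in the log.
    """
    patterns = {
        "requested": rf"Tried to allocate {_SIZE}",
        "total": rf"{_SIZE} total capacity",
        "allocated": rf"{_SIZE} already allocated",
        "free": rf"{_SIZE} free",
        "reserved": rf"{_SIZE} reserved in total",
    }
    sizes = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, message)
        if match is None:
            raise ValueError(f"field {name!r} not found in message")
        sizes[name] = float(match.group(1)) * _UNIT_TO_MIB[match.group(2)]
    return sizes


def summarize(sizes):
    """Derive two rough indicators from the parsed sizes (all in MiB)."""
    # Memory reserved by the caching allocator but not handed out to tensors.
    cached_slack = sizes["reserved"] - sizes["allocated"]
    # Device memory that is neither free nor reserved by this allocator.
    outside = sizes["total"] - sizes["reserved"] - sizes["free"]
    return {
        "cached_slack_mib": round(cached_slack, 2),
        "outside_allocator_mib": round(outside, 2),
    }


if __name__ == "__main__":
    print(summarize(parse_oom(OOM_MESSAGE)))
```

With the values in this log, reserved memory exceeds allocated memory by only about 34 MiB, so the "reserved >> allocated" condition in the hint does not hold here. Most of the 29.50 GiB capacity is reported as neither free nor reserved by PyTorch. The message values are rounded, so these numbers are approximate and indicate where to look rather than a diagnosis.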
Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
### What this PR does / why we need it?
Upgrade CANN to 8.2.rc1

Backport: #1653

### Does this PR introduce _any_ user-facing change?
Yes, docker image will use 8.2.RC1

### How was this patch tested?
CI passed

---------

Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
https://github.com/vllm-project/vllm-ascend/blob/main/.github%2FDockerfile.buildwheel should also be updated; hold off on this until the CI infra is upgraded.
It seems the manylinux image is not ready yet, so let's upgrade the CANN image first.